Systems and methods for granular resource management in a storage network

ABSTRACT

In accordance with some aspects of the present invention, systems and methods are provided for dynamically and/or automatically selecting and/or modifying data path definitions that are used in performing storage operations on data. Alternate data paths may be specified or selected that use some or all resources that communicate with a particular destination to improve system reliability and performance. The system may also dynamically monitor and choose data path definitions to optimize system performance, conserve storage media and promote balanced load distribution.

RELATED APPLICATIONS

Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet, or any correction thereto, are hereby incorporated by reference into this application under 37 CFR 1.57.

This application is also related to the following patents and pending applications, each of which is hereby incorporated herein by reference in its entirety:

U.S. Pat. No. 6,418,478, titled PIPELINED HIGH SPEED DATA TRANSFER MECHANISM, issued Jul. 9, 2002;

U.S. Pat. No. 7,035,880 titled MODULAR BACKUP AND RETRIEVAL SYSTEM USED IN CONJUNCTION WITH A STORAGE AREA NETWORK, filed Jul. 6, 2000;

U.S. Pat. No. 6,542,972 titled LOGICAL VIEW AND ACCESS TO PHYSICAL STORAGE IN MODULAR DATA AND STORAGE MANAGEMENT SYSTEM;

U.S. patent application Ser. No. 10/658,095 titled DYNAMIC STORAGE DEVICE POOLING IN A COMPUTER SYSTEM, filed Sep. 9, 2003, now U.S. Pat. No. 7,130,970, issued Oct. 31, 2006;

U.S. patent application Ser. No. 10/818,749, titled SYSTEM AND METHOD FOR DYNAMICALLY PERFORMING STORAGE OPERATIONS IN A COMPUTER NETWORK, filed Apr. 5, 2004, now U.S. Pat. No. 7,246,207, issued Jul. 17, 2007;

U.S. patent application Ser. No. 11/120,619, titled HIERARCHICAL SYSTEMS AND METHODS FOR PROVIDING A UNIFIED VIEW OF STORAGE INFORMATION, filed May 2, 2005, now U.S. Pat. No. 7,343,453, issued Mar. 11, 2008;

U.S. Provisional Application No. 60/752,203, titled SYSTEMS AND METHODS FOR CLASSIFYING AND TRANSFERRING INFORMATION IN A STORAGE NETWORK, filed Dec. 19, 2005;

U.S. application Ser. No. 11/313,224 titled SYSTEMS AND METHODS FOR PERFORMING MULTI-PATH STORAGE OPERATIONS, filed Dec. 19, 2005, now U.S. Pat. No. 7,620,710, issued Nov. 17, 2009;

U.S. Provisional Application No. 60/752,196 titled SYSTEMS AND METHODS FOR MIGRATING COMPONENTS ON A HIERARCHICAL STORAGE NETWORK, filed Dec. 19, 2005;

U.S. Provisional Application No. 60/752,202 titled SYSTEMS AND METHODS FOR UNIFIED RECONSTRUCTION OF DATA IN A STORAGE NETWORK, filed Dec. 19, 2005;

U.S. Provisional Application No. 60/752,201 titled SYSTEMS AND METHODS FOR RESYNCHRONIZING STORAGE OPERATIONS, filed Dec. 19, 2005; and

U.S. Provisional Application Ser. No. 60/752,197 titled SYSTEMS AND METHODS FOR HIERARCHICAL CLIENT GROUP MANAGEMENT, filed Dec. 19, 2005.

BACKGROUND OF THE INVENTION

Field of the Invention

The inventions disclosed herein relate generally to performing storage operations on electronic data in a computer network. More particularly, aspects of the present invention relate to data transmission schemes used during a storage operation including data pathways and other components used in the transfer of data.

Over time, storage of electronic data has evolved through many forms. During the early development of the computer, data storage was limited to individual computers. Electronic data was stored in the Random Access Memory (RAM) or some other storage medium such as a hard drive or tape drive that was an actual physical part of the individual computer.

Later, with the advent of network computing, storage of electronic data gradually migrated from individual computers to stand-alone storage devices accessible via a network. Over time, these individual network storage devices evolved into more complex systems including networks of tape drives, optical libraries, Redundant Arrays of Inexpensive Disks (RAID), CD-ROM jukeboxes, and other devices. Common architectures included drive pools, which generally are logical collections of drives with associated media groups including the tapes or other storage media used by a given drive pool.

Serial, parallel, Small Computer System Interface (SCSI), or other cables directly connected such stand-alone storage devices to individual computers that were part of a network of other computers such as a Local Area Network (LAN) or a Wide Area Network (WAN). Generally, each individual computer on the network controlled the storage devices that were physically attached to that computer and could also access the storage devices of the other network computers to perform backups, transaction processing, file sharing, and other storage-related operations.

Network Attached Storage (NAS) is another storage scheme using stand-alone storage devices in a LAN or other such network. In NAS, a storage controller computer typically controls the storage device to the exclusion of other computers on the network, but the SCSI or other cabling directly connecting that storage device to the individual controller is eliminated. Instead, storage devices are directly attached to the network itself.

Yet another network storage scheme is modular storage architecture, which is more fully described in U.S. Pat. No. 7,035,880 and U.S. Pat. No. 6,542,972. An example of such a software application is the Galaxy™ system, by CommVault Systems of Oceanport, N.J. The Galaxy™ system is a multi-tiered storage management solution which includes, among other components, a storage manager, one or more media agents, and one or more storage devices. The storage manager directs storage operations of client data to storage devices such as magnetic and optical media libraries. Media agents are storage controller computers that serve as intermediary devices managing the flow of data from client information stores to individual storage devices. Each storage device may be uniquely associated with a particular media agent and this association may be tracked by the storage manager.

A common feature shared by all of the above-described network architectures is the substantially static relationship between storage controller computers and storage devices. In these traditional network architectures, storage devices are generally connected, virtually or physically, to a single storage controller computer. Generally, only the storage controller computer to which a particular device is physically connected has read/write access to that device. One computer typically cannot control the drive pool and media group that is being controlled by another. Requests to store and retrieve data from such a drive pool and media group would have to be coordinated by the controlling computer. Typically, storage media reserved or being written to by one media agent cannot be written to by another media agent. Thus, often storage media being used pursuant to one storage policy cannot be used by another storage policy and vice versa, resulting in the inefficient use of storage resources.

In some prior art systems, storage policies may specify alternate data paths or resources in the case of device failure or an otherwise unavailable data path. However, such systems typically specify a single alternate data path. Moreover, because backup operations are traditionally performed on a client-by-client basis, each client may store information on different media, resulting in inefficient media use. Furthermore, in many systems, failover conditions often result in the use of additional media, further resulting in inefficient use of resources. In addition, alternate data paths are defined in a static fashion, and thus conventional data protection schemes are unable to adapt to changing network conditions.

SUMMARY OF THE INVENTION

In accordance with certain aspects of the present invention, systems and methods are provided for dynamically or automatically selecting and/or modifying data path definitions that are used in performing storage operations. Alternate data paths may be specified or selected that use some or all resources that communicate with a particular destination to improve system reliability and performance. The system may also dynamically monitor and choose data path definitions to optimize system performance, conserve storage media, prevent resource exhaustion and promote balanced load distribution.

In one illustrative embodiment, a method for configuring a storage operation system includes defining a first storage operation path to be used in performing a storage operation. The first storage operation path may specify a destination and substantially all of the resources capable of communicating with the destination. The system may define a second storage operation path used in the storage operation when the first storage path is unavailable.

In an alternate embodiment, a storage operation system may include a management module for controlling or coordinating a storage operation to a destination, a plurality of storage devices, and at least two storage operation paths linking a client to one or more storage devices. The first storage operation path may specify many, most or substantially all of the resources capable of communicating with the destination, while the second storage operation path may be used in the storage operation when the first storage path is unavailable.

In yet another embodiment, a method for consolidating storage policies within a storage operation network is provided, which may include analyzing storage operation paths, which may be defined in storage policies; determining whether any of the storage operation paths have common element points; and consolidating two or more of the storage policies having at least one common element point into a single storage operation policy, such that the single storage operation policy supports copy operations to or with the common element point, such as a common destination.

Another embodiment includes a system for consolidating storage policies within a storage operation network. The system may include a management module for directing a storage operation to a destination, a plurality of storage devices and a plurality of storage operation paths. The storage operation paths may be defined within a plurality of storage policies and have a series of element points defining locations or resources along the path, ending with the destination. The management module may consolidate two or more of the storage policies having at least one common element point into a single storage policy such that the single storage policy supports copy operations to the common element point.

One embodiment of the present invention includes a method for consolidating storage policies within a storage operation network that includes analyzing a plurality of storage operation paths that are defined in storage policies. This may involve identifying certain inefficiencies in the storage operation paths and reconfiguring the storage operation paths to improve system performance. This may further involve monitoring the storage network for the inefficiencies in the storage network subsequent to redefining the plurality of storage operation paths to determine whether the reconfiguration has achieved the desired effect.

Another embodiment of the present invention includes a system for consolidating storage policies within a storage operation network. The system may include a management component for controlling or coordinating a storage operation to a destination using one of a plurality of storage operation paths defined within a plurality of storage policies. The management component may identify inefficiencies in the storage operation paths and reconfigure or redefine the storage operation paths to correct or reduce the identified inefficiencies. The management component may also monitor the storage network, including any reconfigurations, subsequent to redefining the storage operation paths to determine whether the reconfigurations provided the desired correction or improvement. If not, additional analysis and reconfiguration may be performed.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects of the invention are illustrated in the figures of the accompanying drawings which are meant to be exemplary and not limiting, in which like references are intended to refer to like or corresponding parts, and in which:

FIG. 1 is a block diagram of a network architecture for a system to perform storage operations on electronic data in a computer network according to an embodiment of the invention;

FIG. 2 is a block diagram of an exemplary media storage device for performing storage operations on electronic data in a computer network according to an embodiment of the invention;

FIG. 3 is a flow chart illustrating some of the steps of a storage operation in accordance with an embodiment of the invention;

FIG. 4 is a flow chart illustrating some steps of assigning storage policies to system resources and evaluating existing storage policies in accordance with an embodiment of the invention; and

FIG. 5 is a flow chart illustrating some of the steps of a method of dynamically analyzing and managing storage policies and data paths in accordance with an embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention, which may be embodied in various forms. Therefore, specific functional details disclosed herein shall not be interpreted as limiting, but merely as a basis for teaching one skilled in the art to employ the present invention in any specific embodiment.

With reference to FIGS. 1 through 5, representative embodiments of the invention are presented. Turning now to FIG. 1, a block diagram of one network architecture suitable for performing storage operations on electronic data in a computer network according to an embodiment of the invention is shown. The embodiment, as shown, may include a storage management component such as manager 100 and one or more of the following: a client 85, an information store 90, a data agent 95, a media agent 105, an index cache 110, and a storage device 115. The system and elements thereof are exemplary of a three-tier backup system such as the CommVault Galaxy™ backup system, available from CommVault Systems, Inc. of Oceanport, N.J., and further described in U.S. Pat. No. 7,035,880 which is incorporated herein by reference in its entirety.

A data agent 95 is generally a software module that is responsible for archiving, migrating, and recovering data of a client computer 85 stored in an information store 90 or other memory location. Each client computer 85 may have one or more data agent(s) 95 and the system can support multiple client computers 85. The system may include a plurality of data agents 95, each of which is intended to back up, migrate, and recover data associated with a different application. For example, different individual data agents 95 may be designed to handle Microsoft Exchange® data, Lotus Notes® data, Microsoft Windows 2000® file system data, Microsoft Active Directory Objects® data, and other types of data known in the art.

In the case where a client computer 85 has two or more types of data, a dedicated data agent 95 may be used for each data type to archive, migrate, and restore the client computer 85 data. For example, to back up, migrate, and restore all of the data on a Microsoft Exchange 2000® server, the client computer 85 would use one Microsoft Exchange 2000® Mailbox data agent 95 to back up the Exchange 2000® mailboxes, one Microsoft Exchange 2000® Database data agent 95 to back up the Exchange 2000® databases, one Microsoft Exchange 2000® Public Folder data agent 95 to back up the Exchange 2000® Public Folders, and one Microsoft Windows 2000® File System data agent 95 to back up the file system of the client computer 85. These data agents 95 would be treated as four separate data agents 95 by the system even though they reside on the same client computer 85.

In some embodiments, however, multipurpose or generic data agents may be used that operate on multiple data types. For example, one data agent may operate on Microsoft Exchange 2000® Mailbox and Microsoft Windows 2000® File System data, etc.

Storage manager 100, in one embodiment, may be implemented as a software module or application that coordinates and controls various aspects of the system shown in FIG. 1. For example, storage manager 100 may communicate with some or all elements of the system including client computers 85, data agents 95, media agents 105, and storage devices 115, to schedule, initiate, manage and coordinate system backups, migrations, and data recoveries.

In one embodiment, a media agent 105 may be implemented as a software module that conducts data, as directed by storage manager 100, between the client computer 85 and one or more storage devices 115 such as a tape library, a magnetic media storage device, an optical media storage device, or other storage device known in the art. For example, as shown in FIG. 1, storage manager 100 may direct data agents 95 to copy data from one or more clients 85 to storage device 115 through media agents 105. In some embodiments, media agent 105 communicates with and controls the storage device 115.

For example, media agent 105 may instruct storage device 115 to use a robotic arm or other means to load or eject a media cartridge, to archive, migrate, or restore data to or from certain media present in device 115. Media agents 105 may also communicate with the storage devices 115 via a local bus such as a SCSI adaptor, or other suitable connection means. In other implementations, storage device 115 may communicate with the media agent 105 via a Storage Area Network (“SAN”).

Each media agent 105 may maintain an index cache 110 which stores the index data the system generates during backup, migration, and restore storage operations as further described herein. For example, storage operations for Microsoft Exchange® data generate index data containing the location and other information, such as metadata, regarding the data on the storage device 115 on which the Exchange data is stored.

Index data provides the system with an efficient mechanism for locating user files or data for recovery operations. This index data is generally stored with the data backed up to the storage device 115. The media agent 105 that controls the storage operation may also write an additional copy of the index data to its index cache 110. The data in the media agent 105 index cache 110 is thus readily available to the system for use (in storage and retrieval operations and other activities) without having to be first retrieved from a storage device 115.

Storage manager 100 also maintains an index cache 110. Such index data may include logical associations between components of the system, user preferences, metadata regarding application data or user preferences, management tasks, and other useful data. For example, the storage manager 100 may use its index cache 110 to track the logical associations between media agents 105 and storage devices 115.

Index caches 110 typically reside on their corresponding storage component's hard disk or other fixed storage device. Like any cache, the index cache 110 has finite capacity and the amount of index data that can be maintained directly corresponds to the size of that portion of the disk that is allocated to the index cache 110. In one embodiment, the system may manage the index cache 110 on a least recently used (“LRU”) basis as known in the art. When the capacity of the index cache 110 is reached, the system may overwrite those files in the index cache 110 that have been least recently used with the new index data. In some embodiments, before data in the index cache 110 is overwritten, the data may be copied to an index cache copy and stored on a storage device 115. If a recovery operation requires index data that is no longer stored in the index cache 110, such as in the case of a cache miss, the system may recover the index data from the copy stored in storage device 115.
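The LRU behavior described above can be illustrated with a minimal Python sketch. It is not taken from the described system: the class name IndexCache, the use of an in-memory dictionary to stand in for the index cache copy on a storage device 115, and the entry-count capacity are all assumptions made for illustration.

```python
from collections import OrderedDict


class IndexCache:
    """Hypothetical LRU index cache that copies entries out before evicting them."""

    def __init__(self, capacity, archive):
        self.capacity = capacity      # assumed: capacity counted in entries, not bytes
        self.archive = archive        # stands in for the copy kept on a storage device
        self.entries = OrderedDict()  # least recently used entry is first

    def put(self, key, index_data):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = index_data
        while len(self.entries) > self.capacity:
            old_key, old_data = self.entries.popitem(last=False)
            self.archive[old_key] = old_data  # copy out before overwriting

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        # Cache miss: recover the index data from the archived copy.
        index_data = self.archive[key]
        self.put(key, index_data)
        return index_data


cache = IndexCache(capacity=2, archive={})
cache.put("job-a", "index for job a")
cache.put("job-b", "index for job b")
cache.get("job-a")                      # job-b is now least recently used
cache.put("job-c", "index for job c")   # evicts job-b to the archive
recovered = cache.get("job-b")          # miss, served from the archive
```

A lookup for a key that exists in neither the cache nor the archive raises KeyError in this sketch; a real system would need its own handling for that case.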

In some embodiments, components of the system may reside and execute on the same computer. In alternative embodiments, a client computer 85 component such as a data agent 95, a media agent 105, or a storage manager 100 may coordinate and direct local archiving, migration, and retrieval of application functions as further described in U.S. Pat. No. 7,035,880. Thus, client computer 85 component can function independently or together with other similar client computer 85 components.

Turning now to FIG. 2, a block diagram of an exemplary media library storage device 120 for performing storage operations on electronic data in a computer network according to an embodiment of the invention is presented. Media library device 120 represents one specific type of storage device 115 (FIG. 1) that may be used with an implementation of the invention.

Media library storage device 120 may contain any suitable magnetic, optical or other storage media 145 and associated drives 125, 130, 135, and 140. Media 145 may store electronic data containing backups of application data, user preferences, metadata, system information, and other useful information known in the art. Drives 125, 130, 135 and 140 are used to store and retrieve electronic data from media 145. In one embodiment, drives 125, 130, 135 and 140 may function as a drive pool, as further described in application Ser. No. 10/658,095 which is hereby incorporated herein by reference in its entirety. A drive pool is a logical concept that associates drives and storage media with a storage policy and a source device such as a client 85. Storage policies representing storage patterns and preferences are more fully discussed in U.S. Pat. No. 6,542,972 which is hereby incorporated by reference herein in its entirety.

A drive pool may be identified by a set of drives within a library storage device 120 as pointed to by one or more media agents 105. For example, a drive pool known as DP1 consisting of drives 125 and 130 in library 120 known as LIB1 may be associated by a storage policy, with a first media agent 105 MA1 in an index cache 110 entry as follows: LIB1/MA1/DP1. A second drive pool consisting of drives 130, 135, and 140 within the library storage device 120 associated with the same media agent 105 may be expressed in index cache 110 as follows: LIB1/MA1/DP2.
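As an illustration only, the slash-separated entries above could be modeled with a small Python record type. The class name DrivePoolEntry, its field names, and the parse helper are hypothetical; the description does not specify how index cache 110 entries are actually stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrivePoolEntry:
    """Hypothetical record for a LIBRARY/MEDIA_AGENT/DRIVE_POOL index entry."""
    library: str
    media_agent: str
    drive_pool: str

    @classmethod
    def parse(cls, text):
        # Assumes exactly three slash-separated fields, as in the examples above.
        library, media_agent, drive_pool = text.split("/")
        return cls(library, media_agent, drive_pool)

    def __str__(self):
        return f"{self.library}/{self.media_agent}/{self.drive_pool}"


entries = [DrivePoolEntry.parse("LIB1/MA1/DP1"),
           DrivePoolEntry.parse("LIB1/MA1/DP2")]

# Group drive pools by the media agent that points to them.
pools_by_agent = {}
for entry in entries:
    pools_by_agent.setdefault(entry.media_agent, []).append(entry.drive_pool)
```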

As further described herein, the present invention permits logical association of drive pools associated with different media agents 105 (FIG. 1). Multiple drive pools, media agents, and other system components can be associated in a single index cache 110 entry. Thus, for example, an index cache 110 entry for a storage policy, according to an embodiment of the present invention, may combine the two previous entries instead and thus may be logically represented as: ##STR1##

In addition and as further described herein, media 145 may be associated by the system with drive pools or storage policies, and not necessarily with individual drives 125, 130, 135 and 140. A media group may be a collection of media 145 or other storage media assigned to a specific storage policy. The media group may dynamically point to different drive pools as further described herein, including those with different recording formats as the system may update the recording format of the media group in a media group table stored in an index cache 110.

Aspects of the present invention, as further described herein, permit data associated with a particular storage policy copy to be stored on and share certain media 145. Data from each storage policy copy may be appended to media 145 shared by other storage policy copies. Thus, a storage policy copy may be shared between several media agents 105 in a dynamic drive pooling environment with media 145 also being shared by the different media agents 105 and storage policies. Media 145 can be located in virtually any storage device 115 and for a given storage policy copy may be spread across multiple storage devices 115. Thus, an index cache entry may associate multiple media sets 145 with multiple media agents, storage policies, drive pools, and other system components. For example, two different media sets from the previous example of index entries might be associated in a single index cache 110 entry as follows:

storage policy1: media agent1: drive pool1: media set1: media set2

storage policy2: media agent2: drive pool2: media set1: media set2

In addition to media sets, a single index cache 110 entry may also specify and associate multiple media agents 105, storage policies, drive pools, network pathways, and other components.

While the embodiments described above employ the use of two drive pools and two media agents, one skilled in the art will recognize that additional media agents and logical drive pools may be implemented across the storage policies without deviating from the scope and spirit of the present invention.

Turning to FIG. 3, a flow chart 300 illustrating some of the steps involved in performing storage operations on electronic data in a computer network according to an embodiment of the invention is shown. Selection of desired storage components for storage operations may be performed manually or automatically in dynamic fashion. In operation, the system may initiate a storage operation in response to a scheduled procedure or as directed by a user, system administrator, or as otherwise directed by the system (step 310).

For example, the system may initiate a backup operation or a restore operation at a specific time of day or in response to a certain threshold being exceeded as specified in a storage policy. The system may select a media agent 105 (FIG. 1) according to selection logic or a specified data path as further described herein (step 320). The selection logic and data paths may be determined by a set of criteria defined in the storage policies or according to system configuration or operational rules or guidelines. Examples of such criteria may include load balancing within the network, bandwidth use and efficiency, media usage, available media space, etc.

In one illustrative embodiment, the selection logic includes the ability to conduct a LAN-free storage operation, such as using a SAN, when it is desired to optimize storage operations via load balancing. For example, an index entry in index cache 110 may associate certain media agents 105, storage devices 115, or other components with LAN-free storage operations either via user input, network topology detection algorithms known in the art, or other methods. As another example, the system may select a free media agent 105 to optimize storage operations via load balancing when a default media agent 105 or other media agent 105 specified in a storage policy is already performing other storage operations or otherwise occupied. The system may also select an appropriate drive pool in a network storage device according to selection logic further described herein (step 330). Once the system has selected an appropriate media agent and drive pool, the storage operation is performed, using the selected storage components (step 340).
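One possible reading of the media agent selection in step 320 is sketched below in Python. The MediaAgent fields, the select_media_agent function, and the rule of falling back to the least busy candidate are assumptions chosen for illustration; the description leaves the actual selection logic to the storage policies and system configuration.

```python
from dataclasses import dataclass


@dataclass
class MediaAgent:
    """Hypothetical view of a media agent as seen by the selection logic."""
    name: str
    active_jobs: int = 0
    lan_free: bool = False  # assumed flag marking agents reachable over a SAN


def select_media_agent(agents, default_name, prefer_lan_free=False):
    """Return the default agent if it is idle, else the least busy candidate."""
    candidates = [a for a in agents if a.lan_free] if prefer_lan_free else list(agents)
    if not candidates:
        candidates = list(agents)  # no LAN-free agent known; consider all agents
    for agent in candidates:
        if agent.name == default_name and agent.active_jobs == 0:
            return agent
    # Default agent is occupied or excluded: balance load across the rest.
    return min(candidates, key=lambda a: a.active_jobs)


agents = [MediaAgent("MA1", active_jobs=3),
          MediaAgent("MA2", active_jobs=0),
          MediaAgent("MA3", active_jobs=1, lan_free=True)]

busy_default_choice = select_media_agent(agents, "MA1")
lan_free_choice = select_media_agent(agents, "MA1", prefer_lan_free=True)
```

In this example the default agent MA1 is occupied, so the idle agent MA2 is chosen; when a LAN-free operation is requested, only MA3 qualifies.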

Another embodiment of the present invention allows storage policies to be recognized or be defined in terms of sub-clients (e.g., processes or portions of data of a volume that are mutually exclusive) and have data protection operations performed at the sub-client level. For example, a storage policy may specify a path similar to those described above for each sub-client operating on a client. In some embodiments, storage policies associated with each sub-client specify a default data path and one or more alternate data paths. These illustrative paths, in one embodiment, may be expressed as follows:

Default: media agent 1: library 1

Alternate: media agent 2: library 1

Alternate data paths are desirable as they provide additional means by which a storage operation may be completed and thus improve system reliability and promote robust operation. Thus, the system may automatically select certain available alternate data paths to facilitate load balancing and failover recovery. Such alternate data paths may be specified using some or all of the additional routing resources available in the system. For example, the alternate data path above may specify as alternates some or all of the media agents in the system that are capable of communicating with library 1 and may be expressed as follows:

Alternate: media agent 2; media agent 3; . . . media agent n: library 1

Where ‘n’ is the total number of media agents specified in the alternate data path.

This arrangement allows the system to take advantage of other available routing resources, providing the ability to select from multiple data paths to the desired destination. Using one approach, a storage policy may specify all of the media agents 105 in the system capable of communicating with a particular destination (e.g., library 1). This provides the greatest likelihood that a storage operation will be completed, assuming that at least some storage resources are functioning or not otherwise congested.
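The benefit of listing several alternates can be shown with a short Python sketch that tries the default path first and then each alternate in order. The function perform_with_failover, the representation of a data path as a (media agent, destination) tuple, and the use of ConnectionError to signal an unavailable path are assumptions for illustration, not details of the described system.

```python
def perform_with_failover(data_paths, attempt):
    """Try each data path in order and return the first one that succeeds."""
    failures = []
    for path in data_paths:
        try:
            return path, attempt(path)
        except ConnectionError as exc:  # assumed signal for an unavailable path
            failures.append((path, exc))
    raise RuntimeError(f"all {len(failures)} data paths failed")


# Default path followed by alternates that all reach library 1.
paths = [("media agent 1", "library 1"),
         ("media agent 2", "library 1"),
         ("media agent 3", "library 1")]

unavailable = {"media agent 1", "media agent 2"}  # simulated outages


def attempt(path):
    agent, destination = path
    if agent in unavailable:
        raise ConnectionError(f"{agent} cannot reach {destination}")
    return f"stored via {agent}"


used_path, result = perform_with_failover(paths, attempt)
```

With a single alternate, the simulated outage of media agents 1 and 2 would have failed the operation; with media agent 3 also specified, it completes.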

Using another approach, some of the available routing resources, such as media agents 105, may be specified as alternates, providing a greater likelihood that the storage operation will be completed, rather than relying on a single alternate. Such alternate resources may be selected based on the degree of utilization, capacity, bandwidth, physical location, the desired confidence factor or other considerations and may be specified manually or assigned automatically based on data protection goals specified for the system.

In addition, alternate data paths may be specified in many ways to provide robust routing options. For example, alternate data paths may be specified according to user preferences. A system administrator may specify certain alternate data paths and the priority and/or order in which the data paths are to be used. Another method for providing alternate data paths may involve using the “round robin” approach in which alternate data paths are selected from a group of available data paths such that each alternate data path is selected and used before any previously used data path is selected and used again. This approach is typically useful in promoting load balancing within the system as it tends to spread out data transfer operations across available data paths in a substantially uniform fashion. Other approaches may include specifying alternate data paths to emphasize the ability to complete a storage operation in the event of a failover condition.
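The “round robin” approach can be sketched in a few lines of Python using the standard library's itertools.cycle. The class name RoundRobinPaths and the string labels for the data paths are hypothetical and serve only to show that every path is used once before any path is reused.

```python
from itertools import cycle


class RoundRobinPaths:
    """Hypothetical selector that hands out data paths in a fixed rotation."""

    def __init__(self, paths):
        paths = list(paths)
        if not paths:
            raise ValueError("at least one data path is required")
        self._rotation = cycle(paths)

    def next_path(self):
        # Each path is returned once before any path is returned again.
        return next(self._rotation)


selector = RoundRobinPaths(["media agent 2", "media agent 3", "media agent 4"])
picks = [selector.next_path() for _ in range(5)]
```

This sketch does not skip unavailable paths; combining rotation with availability checks would be a further design decision.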

In certain embodiments, alternate data paths may be specified such that data from one client or sub-client may be routed to a particular destination through substantially every available data path that may potentially link the client or sub-client to the destination. Similarly, alternate data paths may be defined such that data is restored to a particular client, sub-client or computing device from some or all storage devices within the system. This arrangement provides significant flexibility within the system for performing and completing both storage and restore operations.

Similarly, in some embodiments, alternate destinations may be specified and used in failover or other emergency data protection operations. For example, a storage policy may specify a data path including library 1, library 2, and others, with the provision that the specified media agents have access to each of the specified libraries. In some embodiments, media agents specified in such data paths may share an index cache.

One benefit of the arrangement described above is the ability to conserve media within a storage system. In some embodiments, storage policies may not have the ability to share storage media due to certain conflicts within programming logic or the need for storage policies to resolve any such conflicts in a mutually exclusive manner to ensure computational integrity. Thus, storage operations performed pursuant to different storage policies are generally required to write to different media, often resulting in the inefficient use of media.

For example, a client may communicate to a storage device 115 through a first media agent 105 pursuant to a first storage policy and a second client may communicate to the same storage device pursuant to a second storage policy and a second media agent. In this case, each communication or storage operation by each media agent may be written to different media in the storage device due to programming constraints. Moreover, when a failover condition occurs, further communications to the storage device may be written to a third media based on the alternate data path definitions, resulting in an even higher media usage rate.

An aspect of the present invention streamlines this process by specifying data paths on a sub-client basis and creating a complementary storage policy based on this information to avoid the logical conflict described above, or any other logical conflict that may exist. Moreover, this arrangement allows multiple clients (and associated sub-clients) to use the same storage policy, significantly reducing the number of storage policies required to manage the system as well as simplifying the process involved in updating or changing the policies themselves. This also facilitates updating and/or changing the client associations with storage policies that control or otherwise specify particulars involved in data movement.

Additionally, two groups of clients may specify two sets of client or sub-client data paths (e.g., a default and alternate for each), but may be governed by a single storage policy in accordance with one embodiment of the present invention. This may be accomplished by examining the data paths and combining or rearranging them into a suitable form for use in the storage policy. For example, a first group of sub-clients may specify the following data paths:

Default data path: Media agent 1: library 1

Alternate data path: Media agent 2: library 1

The second group of sub-clients may specify the following data paths:

Default data path: Media agent 2: library 1

Alternate data path: Media agent 1: library 1

These may be examined and modified (or combined and rearranged) to specify or point to a single storage policy with data paths as expressed below, which takes into account the data path preferences of each sub-client while eliminating the need for two separate storage policies:

Default data path: Media agent 1: library 1

Alternate data path: Media agent 2: library 1

In operation, the system may consult this modified storage policy (default first and alternate second) to obtain data path preferences when moving data from the first set of clients. When moving data from the second set of clients, this storage policy may be consulted in reverse order, thus preserving the original preferences (i.e., Media agent 2 as the default with Media agent 1 as the alternate). Using this single storage policy arrangement, data from various storage operations may be written to the same media, rather than using separate media as explained above, promoting media conservation. Moreover, specifying data paths on a sub-client level allows multiple clients to write data to the same media and avoids the potential logical conflicts described above.
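
The forward and reverse consultation described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the names `POLICY_PATHS`, `GROUP_REVERSED`, and `paths_for_group`, and the group labels, are hypothetical and are not part of the described system.

```python
from typing import Dict, List, Tuple

DataPath = Tuple[str, str]  # (media agent, library)

# The single consolidated storage policy: default first, alternate second.
POLICY_PATHS: List[DataPath] = [
    ("Media agent 1", "library 1"),
    ("Media agent 2", "library 1"),
]

# Hypothetical lookup: whether a group of sub-clients reads the policy in reverse.
GROUP_REVERSED: Dict[str, bool] = {"group 1": False, "group 2": True}


def paths_for_group(group: str) -> List[DataPath]:
    """Return the policy's data paths in the preference order for the given group."""
    ordered = list(POLICY_PATHS)  # copy so the shared policy is never mutated
    if GROUP_REVERSED[group]:
        ordered.reverse()
    return ordered


first_group_default = paths_for_group("group 1")[0]
second_group_default = paths_for_group("group 2")[0]
```

Both groups resolve their preferences from the same policy object, so their data can be written to the same media while each group keeps its original default and alternate.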

Another embodiment in accordance with the present invention includes the case where two client domains are separated by a firewall, each domain containing multiple sub-clients. Assume, for example, that each domain has a set of sub-clients with different data paths as shown below:

Domain 1:

Default data path: Media agent 1: library 1

Alternate data path: Media agent 2: library 1

Domain 2:

Default data path: Media agent 3: library 1

Alternate data path: Media agent 4: library 1

As in the example above, media agent utilization will increase if two storage policies are used to manage this arrangement. Thus, in accordance with an embodiment of the present invention, these data paths may be modified (or combined) into one storage policy set forth below having four specified data paths rather than two storage policies with two data paths each, thus maintaining failover protection and promoting minimal media utilization:

Media agent 1: library 1; Media agent 2: library 1

Media agent 3: library 1; Media agent 4: library 1

When moving data from the first domain, the first entry is consulted, and when moving data from the second domain, the second entry is consulted, which allows information from both domains to be written to the same media, promoting efficient media utilization. This order of operations may be defined within the storage policy or may be specified by placing the appropriate pointers or other referential elements in an index or other entry that governs data path preferences.
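
One way to picture the pointer-based lookup described above is the following Python sketch, in which each domain holds an index into the entries of the single combined storage policy and the first available data path in that entry is chosen. The names `POLICY_ENTRIES`, `DOMAIN_ENTRY`, and `select_path`, and the notion of an "unavailable" set, are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

DataPath = Tuple[str, str]  # (media agent, library)

# One storage policy with four data paths, grouped into two entries.
POLICY_ENTRIES: List[List[DataPath]] = [
    [("Media agent 1", "library 1"), ("Media agent 2", "library 1")],
    [("Media agent 3", "library 1"), ("Media agent 4", "library 1")],
]

# Hypothetical referential elements: each domain points at one policy entry.
DOMAIN_ENTRY: Dict[str, int] = {"Domain 1": 0, "Domain 2": 1}


def select_path(domain: str,
                unavailable: FrozenSet[str] = frozenset()) -> Optional[DataPath]:
    """Return the first data path in the domain's entry whose media agent is available."""
    for agent, library in POLICY_ENTRIES[DOMAIN_ENTRY[domain]]:
        if agent not in unavailable:
            return (agent, library)
    return None  # no usable path remains in this entry


normal = select_path("Domain 1")
failover = select_path("Domain 2", frozenset({"Media agent 3"}))
```

Moving a domain to a different entry in this sketch only requires changing its value in `DOMAIN_ENTRY`; the policy itself is untouched.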

Another benefit of the present invention includes the ability to use a single storage policy to govern multiple clients. This provides users with significant flexibility by allowing them to define a storage policy and “point” to that policy through referential elements to multiple clients, thereby simplifying system administration.

For example, in the case where one or more clients need to have changes or modifications made to an associated storage policy, with the provided arrangement, a single policy may be changed having a global effect rather than requiring a similar change be made to multiple individual policies. Moreover, clients may easily be assigned or moved from one storage policy to another merely by changing a pointer or other referential element. This eliminates having to copy, significantly modify, change or create a new storage policy from scratch. Further, storage policies are no longer defined and associated on an individual client by client basis.

In one embodiment, storage policies and associated storage domains may be associated with one another based on system configuration, user needs, or other considerations. This process may be performed either manually, automatically, or may be partially automated, requiring certain user input such as customization information, intended or expected use, etc. For example, at system setup a configuration program may walk an administrator through a configuration process and prompt the user for certain customization information. In alternative embodiments, this process may be predominantly or completely automated based on certain specific goals including, but not limited to, efficient media usage, degree of desired data protection, and substantially even and/or efficient load distribution.

Flowchart 400 of FIG. 4 illustrates some of the steps involved in assigning storage policies to system resources or in evaluating existing storage policies for possible consolidation as part of an ongoing effort to analyze and increase system efficiency.

As shown, at step 410 any existing storage policies or defined data paths for performing storage operations may be retrieved, examined and analyzed. This may involve, for example, retrieving path information from an index cache associated with a media agent or master storage manager or retrieving similar information from a metabase that may be associated with such components. The analysis may include examining data path information such as origination point (e.g., clients and/or sub-clients), destination point (storage device, library, media pool, etc.), and transmission resources scheduled to be involved, including media agents, data conduits and other transmission elements. In some embodiments, this may involve the creation of a system wide or more limited process-based netlist to obtain a basic understanding of system routing options and transmission patterns and preferences.

At step 420, the system may determine whether any identified clients or sub-clients have a common destination point. The destination points are typically defined as a storage device for receiving data from copy operations representing the last location of data at the completion of a particular copy operation. A list of origination points (e.g., clients and/or media agents) and common destination points may be compiled as a starting point to determine similarities between various identified data paths that may be suitable for combination or rearrangement into one or more storage policies to improve overall system efficiency and/or reduce media consumption. Next, at step 430, media agents and other data transfer resources may be associated with the list to generate a more complete picture of the routes and resources involved/available in traversing the data paths between origination and destination points. At this point, the netlist may be substantially complete taking into account available routing and resource information.
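
Steps 420 and 430 can be sketched, under simplifying assumptions, as grouping data paths by destination and then attaching the media agents that serve each destination. The `DataPath` record, the function names, and the sample values below are hypothetical; an actual netlist would likely carry more routing detail.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class DataPath:
    origin: str        # client or sub-client (origination point)
    media_agent: str   # transmission resource
    destination: str   # storage device or library (destination point)


def build_netlist(paths: List[DataPath]) -> Dict[str, Dict[str, Set[str]]]:
    """Map each destination to the origins and media agents that reach it."""
    netlist: Dict[str, Dict[str, Set[str]]] = defaultdict(
        lambda: {"origins": set(), "media_agents": set()}
    )
    for path in paths:
        entry = netlist[path.destination]
        entry["origins"].add(path.origin)            # step 420: origins per destination
        entry["media_agents"].add(path.media_agent)  # step 430: resources per destination
    return dict(netlist)


def common_destinations(netlist: Dict[str, Dict[str, Set[str]]]) -> List[str]:
    """Destinations shared by more than one origination point."""
    return sorted(d for d, entry in netlist.items() if len(entry["origins"]) > 1)


sample_paths = [
    DataPath("client A", "MA1", "library 1"),
    DataPath("client B", "MA2", "library 1"),
    DataPath("client C", "MA3", "library 2"),
]
netlist = build_netlist(sample_paths)
shared = common_destinations(netlist)
```

Destinations returned by `common_destinations` are the natural candidates for combination into a single storage policy.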

At step 440, the system examines any pre-existing or identified storage policies and compares them with other storage policies and the information generated at step 430 to identify common elements that may be combined or folded into the existing storage policies. This may also involve identifying and comparing clients/sub-clients with common origination points and correlating them with storage policies having common destination points as a basis for potentially creating new storage policies. Other information of interest may include identifying common media agents and associated destination points, etc. for similar reasons.

Next, at step 450, it may be determined whether any identified common elements are precluded from writing information to the same storage device and/or storage media. If so, in some embodiments, these elements may be noted on a list of items not suitable for combination into storage policies and may be identified as needing individual treatment. The gathered information may be analyzed to determine if the number of storage policies may be reduced by combining common elements, by combining or modifying existing storage policies, or by recasting storage policies with other identified data paths into more efficient storage pathways (step 460).
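
A simplified sketch of steps 450 and 460 follows: policies flagged as precluded are set aside for individual treatment, and the remaining policies that share a destination are folded together. The function name `consolidate`, the dictionary layout, and the sample policy names are assumptions for illustration and do not reflect a specific implementation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: policy name -> (destination, ordered list of media agents)
Policies = Dict[str, Tuple[str, List[str]]]


def consolidate(policies: Policies,
                precluded: Set[str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Merge non-precluded policies by destination; list the rest separately."""
    merged: Dict[str, List[str]] = {}   # destination -> combined media agents
    individual: List[str] = []          # policies needing individual treatment
    for name, (destination, agents) in policies.items():
        if name in precluded:
            individual.append(name)     # step 450: not suitable for combination
            continue
        bucket = merged.setdefault(destination, [])
        for agent in agents:
            if agent not in bucket:     # keep first-seen order, avoid duplicates
                bucket.append(agent)
    return merged, individual


existing = {
    "policy A": ("library 1", ["MA1", "MA2"]),
    "policy B": ("library 1", ["MA2", "MA3"]),
    "policy C": ("library 1", ["MA4"]),
}
merged, individual = consolidate(existing, precluded={"policy C"})
```

Here three policies reduce to one combined policy for library 1 plus one policy that is handled on its own.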

For example, the analysis may reveal four existing storage policies that have many common elements. Depending on the management goals of the storage system, these four storage policies may be combined into one comprehensive storage policy with a common destination if minimal media usage is desired or may be combined into two storage policies to minimize the possibility of alternate data path congestion.

Other analysis results may reveal several sub-client data path definitions that can be combined into a new storage policy to reduce media usage without substantially affecting storage device accessibility. Moreover, although some results may suggest the combination of significant numbers of storage policies or other common data paths, such suggestions may be examined to determine whether overall system performance would be adversely impacted, for example, beyond a preset performance threshold, and if so, may not be implemented even though such combinations may reduce overall media consumption.

In some embodiments of the invention, factors other than media consumption or possible congestion may be taken into account when determining how to create, change or modify storage policies to accommodate certain system management goals. Such considerations may include load balancing, optimization, service level performance or other operational goals including adjustments to account for changes that may occur over time.

A system administrator, for example, may wish to maintain a substantially even workload across the storage network and maintain that distribution on a going forward basis. Other goals may include maintaining operational performance within a certain percentage level to ensure a specified level of data protection or maximizing system efficiency during peak usage periods. Achieving these and other goals may involve the dynamic and periodic redefinition of data paths and associated storage policies.

Turning now to FIG. 5, a flow chart 500 is shown illustrating some of the steps involved with the dynamic analysis and potential redefinition of storage policies/selection of alternate data paths in accordance with aspects of the present invention. At step 510, client/sub-client data paths within the system are analyzed similarly to step 410 described above in connection with FIG. 4. Next, at step 520, with the data paths identified, certain system performance and forecasting reports may be run as described in co-pending, commonly assigned cases entitled Systems and Methods for Allocation of Organizational Resources Application, and Hierarchical Systems and Methods for Providing a Unified View of Storage Information, Ser. No. 11/120,619, filed May 2, 2005, which are hereby incorporated by reference in their entirety. Such reports may forecast, based on past performance or other parameters, how resource utilization may grow or otherwise change and predict how capacity, efficiency, failure rates, and traffic load may impact storage operations over time.

Based on the forecasting information, the system analyzes, at step 530, data paths to identify which ones are susceptible to or likely to experience an adverse impact due to the changing conditions (e.g., based on predefined thresholds or resource capacity). Such data path definitions or storage policies may then be modified on a dynamic basis to accommodate or otherwise account for predicted conditions to minimize impact (step 540). For example, if it is determined that certain data paths are expected to become congested after a certain period of time, additional alternate data paths expected to handle the additional load may be added before that point is reached, or other alternate data paths that do not suffer from the same conditions may be specified.
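
Steps 530 and 540 might be approximated as in the sketch below, which compares forecast load against a fraction of each path's capacity and, when a policy contains an at-risk path, appends spare alternate paths that are not themselves at risk. The function names, the 0.8 default threshold, and the sample figures are illustrative assumptions, not values taken from the described system.

```python
from typing import Dict, List


def paths_at_risk(forecast_load: Dict[str, float],
                  capacity: Dict[str, float],
                  threshold: float = 0.8) -> List[str]:
    """Step 530 sketch: paths whose forecast load exceeds threshold * capacity."""
    return sorted(
        path for path, load in forecast_load.items()
        if load > threshold * capacity[path]
    )


def add_alternates(policy_paths: List[str],
                   at_risk: List[str],
                   spare_paths: List[str]) -> List[str]:
    """Step 540 sketch: append healthy spare paths if any policy path is at risk."""
    updated = list(policy_paths)
    if any(path in at_risk for path in policy_paths):
        for spare in spare_paths:
            if spare not in at_risk and spare not in updated:
                updated.append(spare)
    return updated


# Hypothetical forecast and capacity figures in arbitrary units.
forecast = {"MA1->Lib1": 95.0, "MA2->Lib1": 40.0, "MA3->Lib1": 10.0}
capacity = {"MA1->Lib1": 100.0, "MA2->Lib1": 100.0, "MA3->Lib1": 100.0}

at_risk = paths_at_risk(forecast, capacity)
new_policy = add_alternates(["MA1->Lib1", "MA2->Lib1"], at_risk, ["MA3->Lib1"])
```

The original paths are kept in place and the alternates are appended, so existing default preferences are preserved while extra capacity is made available before congestion is reached.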

If certain media agents that serve particular storage devices are expected to become overloaded or constantly busy at or near capacity, additional alternate media agents may be added by combining or otherwise altering storage policy data paths to help reduce the adverse impact of the anticipated problem (e.g., other alternate data paths not suffering from the same or similar conditions may be selected or added to help alleviate any detected or predicted problem).

Moreover, data paths may be changed on a dynamic basis to balance load, maintain a substantially constant data load, or prevent a failover condition in accordance with user specifications or system requirements. In certain embodiments this may involve distributing work load across several communication paths as described in commonly assigned, co-pending case entitled Systems and Methods for Providing Multipath Storage Network, filed on Dec. 19, 2005, and which is hereby incorporated by reference in its entirety.

Next, at step 550, resource reallocation is considered if data path adjustment is not sufficient to correct or acceptably minimize any anticipated problem. This may involve, for example, allocating additional storage resources such as media agents, data paths, and storage devices, etc. from other storage operation cells, as described in commonly assigned, co-pending case entitled Systems and Methods for Migrating Components on a Hierarchical Storage Network, application Ser. No. 60/752,196, filed on Dec. 19, 2005, and which is hereby incorporated by reference in its entirety. If deemed helpful, the reallocation is performed at step 560 as described in that case. The system may then periodically return to step 510 and perform the process as part of an ongoing recursive effort to maintain or optimize system performance.

In some embodiments, prior to actual reallocation of resources, proposed reallocation scenarios may be simulated and evaluated with the expected results extended over time in order to choose the solution to any resource shortcoming that best fits enterprise needs or user expectations. Moreover, in some embodiments, any such resource reallocation may need to be approved by an administrator prior to reallocation, which may involve reviewing simulation results and approving reallocations on a component by component or proposal by proposal basis. However, in other embodiments, such resource reallocation may be performed substantially automatically.

Next, at step 570, the system monitors performance subsequent to resource reallocation to help confirm the reallocation is providing the desired effect. This may involve monitoring the operation of the actual reallocated resources and/or the system components or processes the reallocation was intended to benefit. If actual operation of the system is not in accordance with expectations and/or simulation results, the system may be quiesced, and the original configuration returned until an analysis may be performed to determine why expected results were not achieved.

In one embodiment, a trouble ticketing system or other notification system, as is known in the art, may be activated to notify the administrator of the failed reallocation. Moreover, in some embodiments, the level of performance may be examined to determine if the reallocation is having the expected level of desired effect. For example, if a particular reallocation is operating within a certain percentage of expectations (e.g., 80%), which may be user defined, the reallocation may be considered acceptable. If not, the reallocation may be considered unacceptable, and the system configuration may be returned to its prior state (automatically or upon user approval).
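
The acceptance test described above can be illustrated with a short Python sketch that compares measured performance against the expected (for example, simulated) performance and reports whether the reallocation should be kept. The function name, the interpretation of "within a certain percentage" as a minimum ratio, and the sample figures are assumptions for illustration; only the 80% example value comes from the text.

```python
def reallocation_acceptable(actual: float,
                            expected: float,
                            min_fraction: float = 0.8) -> bool:
    """Return True if actual performance reaches min_fraction of the expectation."""
    if expected <= 0:
        raise ValueError("expected performance must be positive")
    return actual / expected >= min_fraction


def next_action(actual: float, expected: float, min_fraction: float = 0.8) -> str:
    """Decide whether to keep the reallocation or return to the prior configuration."""
    if reallocation_acceptable(actual, expected, min_fraction):
        return "keep"
    return "revert"  # may be automatic or subject to user approval


# Hypothetical throughput figures in arbitrary units.
kept = next_action(actual=85.0, expected=100.0)
reverted = next_action(actual=70.0, expected=100.0)
```

The `min_fraction` parameter stands in for the user-defined percentage, so a stricter or looser acceptance level can be applied without changing the logic.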

In some embodiments, the system may monitor or log some or all resource reallocations and subsequent associated performance changes so that the changes may be continually evaluated, used as a basis or model for future changes, and used as a basis for returning some or all of the system to prior configurations. Moreover, this information may act as a template for future system provisioning and deployment and for evaluating the operation of selected system software or hardware components.

Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, and other devices suitable for the purposes described herein. Software and other modules may be accessible via local memory, via a network, via a browser or other application in an ASP context or via other means suitable for the purposes described herein. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein. User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein. Screenshots presented and described herein can be displayed differently as known in the art to input, access, change, manipulate, modify, alter, and work with information.

While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modifications are intended to be included within the scope of the invention.

1. (canceled)
 2. A method for consolidating storage policies within a storage operation network, the method comprising: accessing a first storage policy of a plurality of storage policies, the first storage policy comprising at least a first storage operation that transfers data from at least a first computer to a first storage device with a first storage operation path; accessing a second storage policy of the plurality of storage policies, the second storage policy comprising at least a second storage operation that transfers data from the first client to the first storage device with a second storage operation path that is different than the first storage operation path; automatically evaluating with one or more computer hardware processors, the first and second storage policies to determine that both the first and second storage policies use both the first and second storage operation paths to conduct data from the first client to the first storage device; consolidating the first and second storage policies into one comprehensive storage policy, wherein the comprehensive storage policy associates at least the first and second storage operation paths to the first computer and the first storage device; and automatically adding at least a third storage operation path to the comprehensive storage policy based on one or more network operating conditions predicted to occur, wherein the third storage operation path is an alternate data path that is different than the first and second storage operation paths.
 3. The method of claim 2 wherein a first media agent transfers the data via the first storage operation path and a second media agent transfers the data via the second storage operation path.
 4. The method of claim 2 wherein the first computer comprises at least one sub client.
 5. The method of claim 2 wherein the one or more network operating conditions comprise at least one of the group consisting of: data transfer rate, network usage, load balancing, resource exhaustion, transmission congestion, or performance optimization.
 6. The method of claim 2 wherein automatically evaluating the first and second storage policies is based at least in part whether the first and second storage policies share a common element.
 7. The method of claim 2 wherein information about the first and second storage operation paths is obtained from an index cache.
 8. The method of claim 2 wherein information about the first and second storage operation paths is obtained from a metabase.
 9. The method of claim 2 wherein automatically evaluating the first and second storage policies is based at least in part on one of the group consisting of: origination point, destination point, transmission resources scheduled to be involved, and a process-based netlist.
 10. The method of claim 2 wherein automatically evaluating the first and second storage policies is based at least in part on which storage operation paths are likely to experience an adverse impact due to changing conditions.
 11. The method of claim 2 wherein dynamically adding the third storage operation path is based at least in part on preventing a failover condition.
 12. A storage operation system comprising: a plurality of storage devices; a plurality of storage policies, wherein a first storage policy comprises at least a first storage operation that transfers data from at least a first computer to a first storage device with a first storage operation path, and wherein a second storage policy comprises a second storage operation that transfers data from the first computer to the first storage device with a second storage operation path that is different than the first storage operation path; and a storage management component executing in one or more computer hardware processors, the storage manager configured to: automatically evaluate the first and second storage policies to determine that both the first and second storage policies use both the first and second storage operation paths to conduct data from the first client to the first storage device; consolidate the first and second storage policies into one comprehensive storage policy, wherein the comprehensive storage policy associates at least the first and second storage paths to the first computer and the first storage device; and automatically add at least a third storage operation path to the comprehensive storage policy based on one or more network operating conditions predicted to occur, wherein the third data path is an alternate data path that is different than the first and second storage operation paths.
 13. The system of claim 12 wherein the storage management component is configured to direct a first media agent to transfer the data via the first storage operation path and a second media agent to transfer the data via the second storage operation path.
 14. The system of claim 12 wherein the first computer comprises at least one sub client.
 15. The system of claim 12 wherein the one or more network operating conditions comprise at least one of the group consisting of: data transfer rate, network usage, load balancing, resource exhaustion, transmission congestion, or performance optimization.
 16. The system of claim 12 wherein the storage management component is configured to automatically evaluate the first and second storage policies based at least in part on whether the first and second storage policies share a common element.
 17. The system of claim 12 wherein the storage management component is configured to obtain information about the first and second storage operation paths from an index cache.
 18. The system of claim 12 wherein the management component is configured to obtain information about the first and second storage operation paths from a metabase.
 19. The system of claim 12 wherein the storage management component is configured to automatically evaluate the first and second storage policies based at least in part one of the group consisting of: origination point, destination point, transmission resources scheduled to be involved, and a process-based netlist.
 20. The system of claim 12 wherein the management component is configured to automatically evaluate the first and second storage policies based at least in part on which storage operation paths are likely to experience an adverse impact due to changing conditions.
 21. The system of claim 12 wherein the storage management component is configured to dynamically add the third storage operation path based at least in part on preventing a failover condition.